Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
Authors
Abstract
Most recently, there has been significant interest in learning contextual representations for various NLP tasks by leveraging large-scale text corpora to train powerful language models with self-supervised objectives, such as Masked Language Model (MLM). However, based on a pilot study, we observe three issues with existing general-purpose language models when they are applied to text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, fail to infer column mentions from cell values, and fail to compose complex target SQL queries. To mitigate these issues, we present a model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas, leveraging generation models to generate high-quality pre-training data. GAP is trained on 2 million utterance-schema pairs and 30K utterance-schema-SQL triples, whose utterances are produced by generative models. Based on experimental results, neural semantic parsers that leverage GAP as a representation encoder obtain new state-of-the-art results on both the Spider and Criteria-to-SQL benchmarks.
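To make the encoder's role concrete, the sketch below shows one common way a neural text-to-SQL parser feeds an utterance together with a flattened table schema through a pretrained contextual encoder. It uses the Hugging Face transformers API with a generic roberta-base checkpoint as a stand-in; the checkpoint name and the schema serialization format are illustrative assumptions, not the GAP release.

```python
# A minimal sketch (not the paper's released code) of how a text-to-SQL
# parser can consume a GAP-style contextual encoder. The "roberta-base"
# checkpoint and the schema serialization are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

utterance = "How many singers are older than 30?"
# Flatten the table schema (table and column names) into the same input
# sequence, so the encoder contextualizes column mentions jointly with
# the question, which is the setting GAP's pre-training objectives target.
schema_items = ["singer", "singer.name", "singer.age", "concert", "concert.year"]
sep = tokenizer.sep_token
serialized = f" {sep} ".join([utterance] + schema_items)

inputs = tokenizer(serialized, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = encoder(**inputs)

# One contextual vector per input token; a downstream decoder would align
# these states with schema items while predicting the SQL query.
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```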
Similar Resources
Learning Structured Natural Language Representations for Semantic Parsing
We introduce a neural semantic parser which converts natural language utterances to intermediate representations in the form of predicate-argument structures, which are induced with a transition system and subsequently mapped to target domains. The semantic parser is trained end-to-end using annotated logical forms or their denotations. We achieve the state of the art on SPADES and GRAPHQUESTIO...
Learning for Semantic Parsing
Semantic parsing is the task of mapping a natural language sentence into a complete, formal meaning representation. Over the past decade, we have developed a number of machine learning methods for inducing semantic parsers by training on a corpus of sentences paired with their meaning representations in a specified formal language. We have demonstrated these methods on the automated constructio...
Multilingual Semantic Parsing: Parsing Multiple Languages into Semantic Representations
We consider multilingual semantic parsing – the task of simultaneously parsing semantically equivalent sentences from multiple different languages into their corresponding formal semantic representations. Our model is built on top of the hybrid tree semantic parsing framework, where natural language sentences and their corresponding semantics are assumed to be generated jointly from an underlyi...
Learning Discrete Representations via Information Maximizing Self-Augmented Training
Learning discrete representations of data is a central machine learning task because of the compactness of the representations and ease of interpretation. The task includes clustering and hash learning as special cases. Deep neural networks are promising to be used because they can model the non-linearity of data and scale to large datasets. However, their model complexity is huge, and therefor...
Learning Discrete Representations via Information Maximizing Self-Augmented Training
Our method is related to denoising auto-encoders (Vincent et al., 2008). Auto-encoders maximize a lower bound of mutual information (Cover & Thomas, 2012) between inputs and their hidden representations (Vincent et al., 2008), while the denoising mechanism regularizes the auto-encoders to be locally invariant. However, such a regularization does not necessarily impose the invariance on the hidd...
Journal
Journal Title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2021
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v35i15.17627